In this video I cover how to use PySpark with AWS Glue. Using the resources I have uploaded to GitHub, we carry out a full tutorial on how to manipulate data and perform ETL tasks within the AWS Glue ecosystem. Don't worry if you are new to PySpark, AWS, or Glue: I guide you through everything step by step.

LINK TO GITHUB TUTORIAL RESOURCES:
💾 Code Repo:
📈 Slides: %20-%20PySpark%20For%20AWS%

SUPPORT THE CHANNEL:
☕ Buy Me A Coffee:
🖥️ My VPN:

00:00 - Intro
00:46 - Set Up
08:41 - Run Our First PySpark Code - Read Up Data Using A DynamicFrame
10:13 - Spark And PySpark Theory
19:53 - DynamicFrame PrintSchema
22:29 - DynamicFrame Count
23:30 - DynamicFrame Select
27:49 - DynamicFrame Drop Fields
31:02 - DynamicFrame Change Field Name
37:31 - DynamicFrame Filtering
41:39 - DynamicFrame Joining
47:29 - DynamicFrame Write To S3
54:12 - DynamicFrame Write To Glue Data Catalog
58:55 - Spark DataFrame Theory
01:00:25 - Convert To A Spark DataFrame
01:02:49 - Spark DataFrame Select Columns
01:04:31 - Spark DataFrame Add Columns
01:11:06 - Spark DataFrame Drop Columns
01:14:11 - Spark DataFrame Group By And Aggregate
01:15:58 - Spark DataFrame Filter And Where Clause
01:18:58 - Spark DataFrame Joins
01:24:21 - Spark DataFrame Write
01:36:20 - Outro
01:36:32 - Channel Supporters Shout Out

OTHER USEFUL LINKS:
📹 Glue Tutorial:
ℹ️ My Website:
🔗 Linkedin:

😎 About me
I have spent the last decade immersed in the world of big data, working as a consultant for some of the globe's biggest … journey into the world of data was not the most conventional. I started my career working as a performance analyst in professional sport at the top levels of both rugby and football. I then transitioned into a career in data and computing. This journey culminated in the study of a Masters degree in Software …

Enjoy 🤘











